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Abstract 

Background: Polycomb group proteins form multicomponent complexes that are important for establishing 
lineage-specific patterns of gene expression. Mammalian cells encode multiple permutations of the prototypic 
Polycomb repressive complex 1 (PRCl) with little evidence for functional specialization. An aim of this study is to 
determine whether the multiple orthologs that are co-expressed in human fibroblasts act on different target genes 
and whether their genomic location changes during cellular senescence. 

Results: Deep sequencing of chromatin immunoprecipitated with antibodies against CBX6, CBX7, CBX8, RINGl and 
RING2 reveals that the orthologs co-localize at multiple sites. PCR-based validation at representative loci suggests 
that a further six PRCl proteins have similar binding patterns. Importantly, sequential chromatin immunoprecipitation 
with antibodies against different orthologs implies that multiple variants of PRCl associate with the same DNA. At 
many loci, the binding profiles have a distinctive architecture that is preserved in two different types of fibroblast. 
Conversely, there are several hundred loci at which PRCl binding is cell type-specific and, contrary to expectations, the 
presence of PRCl does not necessarily equate with transcriptional silencing. Interestingly, the PRCl binding profiles are 
preserved in senescent cells despite changes in gene expression. 

Conclusions: The multiple permutations of PRCl in human fibroblasts congregate at common rather than specific 
sites in the genome and with overlapping but distinctive binding profiles in different fibroblasts. The data imply that 
the effects of PRCl complexes on gene expression are more subtle than simply repressing the loci at which they bind. 



Background 

Polycomb-group (PcG) proteins were originally identi- 
fied by their developmental effects in Drosophila but 
their mammalian counterparts are implicated in a wide 
variety of physiological processes, including pluripo- 
tency, imprinting, differentiation and senescence [1-3]. 
Senescence is a state of permanent cell cycle arrest that 
occurs in response to various forms of cellular stress and 
acts as a front-line defense against potentially oncogenic 
lesions [4,5]. Genetic ablation of a number of PcG genes 
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results in premature senescence, in part because of de- 
repression of the CDKN2A tumor suppressor [6,7]. 

PcG proteins operate within multi-component com- 
plexes, the best characterized of which are termed Poly- 
comb repressive complexes 1 and 2 (PRCl and PRC2) 
[1-3]. PRC2 is responsible for tri-methylating histone H3 
on lysine 27 (H3K27me3), generally considered to be a 
mark of transcriptional repression, and apart from dupli- 
cation of the Ezh subunit, the core components of the 
complex are conserved between flies and mammals [8]. 

In contrast, the PRCl components have undergone con- 
siderable expansion during evolution. The prototypic PRCl 
complex in Drosophila comprises equimolar amounts of 
Polycomb (Pc), Posterior sex combs (Psc), Polyhomeotic 
(Ph) and Sex combs extra (See) [9,10]. However, as human 
cells encode five Pc, six Psc, three Ph and two See orthologs. 
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there are multiple variants of PRCl [2,8,11]. Biochem- 
ical analyses indicate that a typical PRCl complex con- 
tains a single representative from each gene family 
[7,12-15] and, although it has not been formally shown 
that all of the orthologs participate in functional PRCl- 
type complexes, there could in principle be up to 180 
different permutations. The situation is further compli- 
cated by the ability of the Psc and See subunits to form 
functional complexes that contain RYBP or YAF2 ra- 
ther than the Pc and Ph orthologs [15-17]. 

The Psc and See subunits combine to form an E3 ubi- 
quitin ligase that mono-ubiquitinates histone H2A on ly- 
sine 119 (H2AK119), a modification that is also associated 
with transcriptional repression [1-3]. However, it has not 
been conclusively established whether the H2AK119 
modification is essential for blocking transcription and 
whether it is carried out by the prototypic PRCl or alter- 
native complexes [15-20]. 

The reason for the expansion of PRCl families is unclear 
as there is currently little evidence for functional diversifi- 
cation [15,21-23]. It has been generally assumed that par- 
ticular family members operate in different cell lineages or 
that the various permutations of PRCl regulate specific 
sets of target genes [15,24]. For example, CBX7 is the 
predominant Pc ortholog in embryonic stem (ES) cells 
but its expression decUnes during differentiation and it 
is replaced by CBX2, CBX4 and CBX8 [24,25]. If these 
proteins bind selectively, as suggested, then the number 
of PcG target genes would potentially increase in cell 
types that express multiple variants of the canonical 
PRCl complex. 

Here we investigate the genome-wide distribution of the 
PRCl components expressed in two strains of normal hu- 
man fibroblasts (HFs), a somatic cell type required for the 
physical integrity of multiple tissues and the classical 
model for the study of cellular senescence. Contrary to ex- 
pectations, we find that three Pc orthologs (CBX6, 7 and 
8) and both See proteins (RINGl and RING2) have very 
similar binding patterns throughout the genome with little 
evidence for target gene specificity. Moreover, by assem- 
bling a panel of antibodies that support chromatin immu- 
noprecipitation (ChIP) of the endogenous proteins, we 
find that 11 of the PRCl core components expressed in 
HFs display virtually identical binding patterns at repre- 
sentative loci. Importantly, sequential ChIP with anti- 
bodies against different members of the Pc, Ph and See 
families implies that multiple variants of PRCl associate 
simultaneously with the same chromatin. At many loci, 
the binding patterns have an inherent architecture that is 
faithfully conserved between the two strains of HF but at 
an equivalent number of loci the binding is strain specific. 
Surprisingly, there is no strict correlation between the 
presence and position of the PRCl peaks and RNA ex- 
pression from adjacent gene(s). Moreover, the genomic 



location of the Pc orthologs remains unchanged in senes- 
cent HFs. The findings are consistent with the idea that 
mammalian PRCl proteins congregate as complexes of 
complexes and that their control over gene expression is 
more subtle than simply blocking transcription of the 
genes to which they bind. 

Results 

Expression of PRC1 genes in human fibroblasts 

Most studies of mammalian PcG proteins and their poten- 
tial target genes have focused on ES cells and in vitro 
models of differentiation. We chose to investigate the situ- 
ation in HFs, in part because of the evidence linking PRCl 
proteins with senescence, via effects on the CDKN2A 
locus [26], but also because HFs express a much broader 
repertoire of PRCl genes than ES cells. For example, deep 
sequencing of RNA isolated from the BF strain of adult 
breast fibroblasts and the Hs68 strain of neonatal foreskin 
fibroblasts suggested that 15 of the 16 genes encoding core 
PRCl components are expressed, albeit at variable levels 
(Figure lA). Among the four gene families, the highest ex- 
pression was observed for CBX4, MEL18, HPH2 and 
RINGl. In contrast to mouse embryo fibroblasts [24,25], 
the HFs expressed substantial amounts of CBX7 whereas 
CBX2 was barely detectable at the RNA level. 

Multiple PRCl complexes bind simultaneously at the 
CDKN2A locus 

The RNA expression data implied that HFs have the po- 
tential to express many different permutations of PRCl 
and are therefore an ideal system in which to investigate 
target specificity. Using the CDKN2A locus as a test bed 
for antibody evaluation, we assembled a panel of custom- 
made and commercially available reagents that support 
Chip of endogenous PRCl components in primary HFs. 
A number of PRCl components have been previously 
located at CDKN2A using a validated set of PCR primers 
(Figure IB and [7,12,27]) but our survey has added 
substantially to the list of proteins involved. We found 
that four Pc proteins (CBX4, 6, 7 and 8), two Psc proteins 
(BMIl and MEL18), all three Ph orthologs (HPHl, 2 and 
3) and both See proteins (RINGl and 2) were not only 
enriched at the CDKN2A locus but had virtually identical 
binding profiles (Figure IC). Importantly, we confirmed 
that the antibodies did not cross-react with members of 
the same family, even when they were over-expressed 
(Additional file 1: Figure SI). 

Genome-wide profiling of PRCl binding sites 

To determine whether PRCl proteins are co-localized at 
other target genes, we conducted genome-wide ChlP- 
sequencing (ChlP-seq) with antibodies against three of 
the Pc orthologs and both See orthologs for the BF and 
Hs68 fibroblasts. We reasoned that by comparing the 
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(See figure on previous page.) 

Figure 1 Multiple PRC1 components have similar binding profiles at the CDKN2A locus. (A) Relative expression levels of the core PRCl 
components in the BF and Hs68 strains of primary fibroblasts. The numbers are mean FPKM values from four independent RNA-sequencing 
analyses. CDK4 and TUBB are shown as controls. (B) Genomic organization of the human CDKN2A/B locus with colored boxes depicting the exons 
that encode pl6'^^^^ pi 4^^^ and pi 5'^^^^. The locations of primer sets used for ChIP analyses are shown below, numbered according to Bracken 
et ol. [27]. The sequences are described in Additional file 6: Table S3. (C) Examples of ChIP data using antibodies against four Pc proteins (blue), 
two Psc proteins (pink), three Ph orthologs (orange) and both See proteins (green) using the indicated primer sets. Binding is shown as the 
percentage of input. Non-specific antibodies or pre-immune serum were used as controls as appropriate (shown in grey). (D) Screenshot showing 
DNA sequence tag densities at the CDKN2A locus following ChlP-sequencing with the indicated antibodies in BF fibroblasts. The maximum 
coverage for each track is shown on the left and a size bar is included above the IGV screenshot. ChIP, chromatin immunoprecipitation; kb, 
kilobase; FPKM, fragments per kilobase of gene per million fragments mapped; IGV, Integrative Genomics Viewer. 



binding profiles of CBX6, CBX7 and CBX8, we could as- 
certain whether they regulate distinct or overlapping sets 
of target genes in HFs. Conversely, as every catalytically 
active PRCl complex should contain either RINGl or 
RING2, the combined dataset should provide an impres- 
sion of all PRCl binding sites in the genome, which would 
complement the CBX profiles. For reference purposes, 
we also generated ChlP-seq profiles for H3K27me3 and 
H3K4me3 for the same cells. CDKN2A is shown as an ex- 
ample in Figure ID. Sequencing was performed on the 
Illumina GAIIx platform and typically generated between 
12 and 60 million 36-bp reads that could be mapped to 
the hgl9 release of the human genome. The raw and proc- 
essed data have been deposited under GEO accession 
number [GEO:GSE40740] [28]. 

Unlike the inferences drawn from other cell systems 
[15,24], the five PRCl proteins showed remarkably simi- 
lar binding patterns at multiple sites throughout the 
genome (Figure 2). However, the bioinformatic interpret- 
ation was complicated by the fact that peak-calling algo- 
rithms, such as MACS [29], often identified different 
numbers of peaks at particular loci depending on the 
density of the sequence reads for each antibody. To try 
to clarify the situation, the ChlP-seq was repeated with 
chromatin prepared from independent cell populations 
rather than simply increasing the number of reads. 
Merging of replicate datasets improved the definition 
of coincident binding sites but did not completely re- 
solve the issue of peak calling. For example, it was 
common to find regions where a robust peak for all 
five PRCl proteins was flanked by weaker peaks where 
only some of the proteins were scored by MACS (for 
example, GATA6, UNCX and NKX2-3 in Figure 2). We 
interpret these 'foothilF peaks as representing weaker 
or less frequent contacts at the fringes of the main 
body rather than evidence for specific binding by sub- 
sets of PRCl proteins. 

Validation of PRCl complex binding at representative loci 

To support our interpretation of the ChlP-seq data, we 
selected representative examples of target loci and per- 
formed conventional ChIP followed by quantitative 
PCR analyses with a series of primers that not only 



crossed each peak but allowed discrimination between 
regions of high and low/no enrichment (Additional file 
2: Figure S2). The PCR-based assays confirmed the re- 
sults of the ChlP-seq but additionally showed that, 
where tested, all of the PRCl proteins for which we 
have suitable antibodies had virtually identical binding 
profiles. These findings are hard to reconcile with the 
idea that different permutations of PRCl bind selectiv- 
ity at different target genes. 

Multiple variants of PRC1 associate with the same DNA 

As the Chip data implied that multiple variants of PRCl 
were occupying very similar positions in the genome, it 
was important to know whether they were simultan- 
eously associated with the same chromatin. To address 
this question, we performed sequential ChIP with anti- 
bodies against different members of the Pc, Ph and See 
families. For example, chromatin precipitated with an 
antibody against CBX6 was eluted from the Protein-A 
beads and re-precipitated with antibodies against CBX7, 
CBX8 or an irrelevant IgG control. The recovered chro- 
matin was then interrogated with PCR primers that 
demonstrate localized enrichment at CDKN2A, and at 
other loci (Figure 3 and data not shown). Although the 
analyses were not exhaustive and do not indicate the 
proportion of each protein that is sequentially precipi- 
tated, the data infer that the three CBX proteins can be 
associated with the same DNA, and that this holds true 
for the three HPH proteins and both RING proteins. As- 
suming that the canonical PRCl complex contains a sin- 
gle representative from each of the component families, 
and that they can freely assort, the data imply that mul- 
tiple permutations of PRCl congregate at common sites 
throughout the genome. 

Location of PRCl binding sites relative to potential 
target genes 

Irrespective of the number of PRCl proteins detected 
at any one site, an obvious point of interest was the rela- 
tion between the binding profiles and potential target 
genes. We compiled a list of candidate loci at which ChlP- 
seq detected at least one CBX, one RING protein and 
H3K27me3, using the position of the closest transcription 
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(See figure on previous page.) 

Figure 2 Representative examples of ChlP-sequencing data at annotated loci. Tlie panels sliow DNA sequence tag densities observed 
following ChlP-sequencing with the indicated antibodies at GATA6 (A), MEISl (B), UNCX (C), ZIC4 (D) and NKX2-3 (E), in the BF strain of HFs. Panels 
(F) and (G) show the equivalent ChlP-sequencing data for the TLX2 and LBXl loci in both BF and Hs68 cells. The maximum coverage for each 
track is shown on the left and a size bar is included above the IGV screenshot. The genomic organization of each locus according to the UCSC 
genome browser is shown below, with arrows indicating the direction of transcription. HF, human fibroblast; kb, kilobase; IGV, Integrative 
Genomics Viewer. 



start site (TSS) as the initial criterion. However, a striking 
feature of the data was that many of the ChlP-seq peaks 
had a complicated architecture that would be inconsistent 
with a normal distribution around a single binding site. It 
was also common to find multiple discrete peaks at the 
same locus and there was no simple relation between the 
number of peaks identified by MACS and the number of 



potential target genes. The bioinformatically compiled lists 
were therefore edited manually to try to resolve ambigu- 
ities (Additional file 3: Table SI). For example, where 
peaks overlapped adjacent coding and non-coding loci, 
preference was given to the annotated coding gene. 

Most published descriptions of PRCl and H3K27me3 
profiles infer that they are localized in relatively discrete 
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Figure 3 Sequential ChIP of PRCl orthologs at the CDKN2A locus. (A) Sequential ChIP was performed with antibodies against CBX8 followed 
by CBX6 and CBX7, or CBX6 followed by CBX7 and CBX8 as indicated. An irrelevant antibody was used as a control (shown in grey). The 
recovered chromatin was subjected to quantitative PCR with primers that detect specific PRCl enrichment at CDKN2A (as described in Figure 1). 
(B and C) Equivalent analyses with antibodies against HPHl, HPH2 and HPH3 (B) or RINGl and RING2 (C). ChIP, chromatin immunoprecipitation. 
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regions within a few kilobase pairs from the TSS. Al- 
though this was a common scenario in our target gene 
list (summarized in Figure 4), the distance from the TSS 
was very variable (up to 620 kb) and in many cases 
exceeded the +/- 10 kb range that is typically used in 
this type of analysis. Importantly, at a substantial num- 
ber of loci the PRCl peaks were much closer to a tran- 
scription end site (TES) than a TSS. For example, the 
simple peak just downstream of the MEISl gene is ap- 
proximately 150 kb from the nearest TSS, which is at 
the 5' end of MEISl and marked by H3K4me3 (Figure 2B 
and Additional file 2: Figure S2C). There is no other an- 
notated locus for 500 kb in either direction. We esti- 
mated that around 15% to 16% of the potential target 
loci had peaks that were most closely associated with the 
3' end of the transcription unit (Figure 4) but acknow- 
ledge that for short genes, this classification is debatable. 

The third group of potential targets, representing 
about 18% of the total, had PRCl peaks associated with 
both ends of the transcription unit. However, within 
this category, there were a number of distinct scenarios 
(Figure 4). Some loci were flanked by relatively discrete 
peaks that did not spread across the body of the gene 
(for example, UNCX in Figure 2C), others had internal 



peaks at some distance from the TSS or TES, but a more 
common pattern was where peaks were located up- 
stream, downstream and within the transcription unit 
(for example, ZICl and ZIC4 in Figure 2D). Where such 
complicated profiles occurred at closely linked genes, it 
was not possible to assign the peaks to specific loci. 

In conclusion, our data suggest that there is an inher- 
ent complexity in the shape of the PRCl peaks and their 
location relative to potential target genes. These intrica- 
cies in the ChlP-seq profiles have been largely ignored 
or masked by bioinformatics-based meta-analyses. 

Comparison of the PRC1 binding profiles in different 
fibroblasts 

Despite some residual ambiguities, the peak classifica- 
tion scheme yielded very similar conclusions regarding 
the numbers of putative target genes in the two strains 
of HF and the proportions in each category (Figure 4A). 
A total of 986 loci were scored as candidate PRCl 
targets in BF cells compared with 1,099 in Hs68 cells 
(Additional file 3: Table SI). Importantly, although there 
were 685 loci in common between the gene lists, 301 
were unique to BF cells and 414 were unique to Hs68 
cells (Figure 4B). 
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At the shared loci, the binding patterns were very simi- 
lar in the two HF strains. This was particularly striking at 
loci where the peaks had a complicated architecture. In- 
deed, many of the peaks appeared to resolve into evenly 
spaced arrays, suggesting some degree of periodicity (for 
example, TLX2 and LBXl in Figure 2F, G). As the same 
periodicity was observed with antibodies against all five 
PRCl proteins, with independent preparations of chroma- 
tin, and in both strains of HF, it seems unlikely to be a 
technical artefact and warrants further investigation. 

Where the PRCl profiles appeared to be strain specific, the 
differences were generally unambiguous. For example, TBX2 
is a PRCl target in BF but not Hs68 cells, whereas the 
adjacent TBX4 gene is occupied in both (Figure 5 and 
Additional file 4: Figure S3). Similarly, NRNl and RUNX3 are 
targets in Hs68 but not BF cells, whereas MEISl shows the re- 
ciprocal pattern of occupancy (Additional file 2: Figure S2 and 
Additional file 4: Figure S3). Among the most striking exam- 
ples of strain-specific profiles are those for the four HOX gene 
clusters (Additional file 5: Figure S4). For example, whereas 
the PRCl proteins occupy the posterior but not the anterior 
regions of the HOX gene clusters in BF cells, the pattern in 
Hs68 cells is more complicated, with interspersed binding 
and non-binding domains across the entire locus. 

Relating PRCl occupancy to gene expression 

Although most of the PRCl-occupied loci were devoid of 
the H3K4me3 mark, consistent with transcriptional re- 
pression, a substantial proportion (38%) had an H3K4me3 
peak associated with the TSS (for example, TBX2 and 
TBX4 in Figure 5). To explore the relation between PRCl 
occupancy and gene expression more directly, we used 
strand-specific RNA-sequencing (RNA-seq) to profile the 
transcriptome in BF and Hs68 cells and cross-referenced 
the resulting datasets with the list of potential target genes 
compiled from the ChlP-seq. The analyses were con- 
ducted on two independently prepared RNA samples from 
each strain of HF, and an arbitrary threshold of ten read 
counts in duplicate samples was taken as evidence of ac- 
tive transcription (Additional file 3: Table SI). On this 
basis, 28% of the PRCl target genes in each cell type were 
deemed to be transcriptionally active, whereas 72% were 
considered silent (Figure 4C). This compares with a 47% 
to 53% ratio for all expressed and non-expressed genes. 
At representative loci selected for validation, there was 
excellent agreement between the results obtained by 
quantitative reverse transcription PCR (qRT-PCR) and the 
RNA-seq data. For example, NRNl and RUNX3 were 
expressed in BF but not Hs68 cells, ISL2 and GATA2 were 
expressed in Hs68 but not BF, whereas MEISl was 
expressed in both, albeit at different levels (Figure 6). An 
obvious implication is that the presence of PRCl at a locus 
does not necessarily equate with repression, although for- 
mal proof would require analyses at the single cell level. 



The genomic landscape of Polycomb binding is 
maintained at senescence 

One of the motivations for profiling Pc orthologs in HFs 
was to understand the role of PRCl complexes in the 
regulation of CDKN2A and senescence. We therefore 
conducted ChlP-seq for CBX7 and CBX8 in Hs68 cells 
that had been to grown to replicative senescence, as 
judged by a failure to double after 6 to 8 weeks in cul- 
ture. As shown in Figure 7A, staining for senescence- 
associated |3-galactosidase (SA-pgal) activity indicated 
that the cell populations were uniformly senescent and 
the anticipated accumulation of pi 6^^^^^^ and decline of 
p-|^giNK4c igygig |-3Q] ^ere confirmed by immunoblotting 

(Figure 7B). Remarkably, the CBX binding landscapes in 
the senescent cells were qualitatively indistinguishable 
from those in proliferating Hs68 cells, in terms of the 
architecture and locations of the peaks (Figure 7C). 
However, there was an overall reduction in the density 
of sequence reads. This was particularly notable for 
CBX7, potentially reflecting the decline in the total levels 
of CBX7 in senescent cells (Figure 7B), as previously 
reported [31]. Thus, although there was no evidence 
for dramatically altered binding profiles, such as those 
observed between the two strains of HFs, there could 
be relatively subtle changes in the amounts of the dif- 
ferent Pc orthologs associated with specific loci. Such 
changes are presumably enough to cause upregulation 
ofpl6~. 

Discussion 

It has been widely assumed that the different variants of 
PRCl encoded by mammalian cells are likely to repress 
different sets of target genes or to substitute for one an- 
other in different contexts. Here we suggest a different 
scenario based on evidence that PRCl complexes act 
collectively rather than individually and can be associ- 
ated with transcriptionally active genes. It is not yet 
known how many of the potential permutations of the 
canonical PRCl complex can be formed in mammalian 
cells as the existing biochemical data do not provide a 
strong case for preferential associations between particu- 
lar subunits. Allowing for free assortment, HFs have the 
potential to express multiple variants of PRCl as judged 
by detection of transcripts for 15 of the 16 core compo- 
nents. We might therefore have expected to see CBX6, 7 
and 8 peaks in discrete locations in the ChlP-seq data, 
in line with recent suggestions [15,24], but instead found 
that they are largely co-localized. Although there were 
sites where subsets of the CBX proteins were detected, 
the signals were generally weaker compared to the loci 
at which all three CBX proteins were detectable. Like- 
wise, we might have expected to see selective binding of 
RINGl and RING2, particularly as they can participate 
in alternative complexes with RYBP and YAF2 but, as 
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others have shown [15-17], they mostly occupied the 
same positions as the CBX proteins. 

As our findings appeared to be odds with recent re- 
ports, we re-examined the published datasets to seek 
a potential explanation, focusing on studies that com- 
pared the binding of different members of the same pro- 
tein family. For example, an extensive survey of 29 
chromatin regulators in the K562 human leukemia cell 



line included data showing that endogenous CBX2 and 
CBX8 had remarkably coincident binding profiles [32]. 
Although not explicitly discussed, the inference is that 
CBX2 and CBX8 have common targets. In contrast, a 
separate study that compared the binding of epitope- 
tagged versions of the Psc family in 293T cells concluded 
that they bind selectively to different categories of target 
genes [15]. Despite the emphasis on selectivity, the data 
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(See figure on previous page.) 

Figure 6 Correlation of ChlP-sequencing and RNA-sequencing in BF and Hs68 cells. (A) Comparison of CBX7 binding profiles (top tracl<s) 
and RNA-sequencing tag densities at tine indicated loci in BF and Hs68 cells. For comparative purposes, the sequence tag densities are plotted on 
equivalent scales. (B) Reverse transcription and quantitative PCR validation of the RNA-sequencing data at the indicated loci. Relative expression 
levels in BF cells (purple) and Hs68 (tan) cells were calculated relative to RPS17 as an internal standard and plotted on a logarithmic scale. The 
color-coded B and H symbols above the graph indicate whether the locus is occupied by PRCl complexes in BF (purple) or Hs68 (tan) cells. Note 
that DKK2, GADl, FGFW, SFRP2, FOXEl and TBX4 are PRCl targets in both cell backgrounds. ChIP, chromatin immunoprecipitation; kb, kilobase; 
RNA-seq, RNA-sequencing. 



clearly indicate that a substantial fraction of the target 
loci were co-occupied by multiple Psc family members. 
Along similar lines, it was reported that, in mouse em- 
bryoid bodies, Cbx2 and Cbx4 bind at both overlapping 
and non-overlapping sets of target genes [24]. The evi- 
dence for specificity was based largely on differences in 
the sequence read densities achieved with different anti- 
bodies and there was considerable variation in the num- 
bers of peaks identified with each reagent. Of note, the 
data imply that 90% of the Cbx2 peaks overlap with 
those of Cbx4. 

Despite their different interpretations, these published 
datasets are consistent with the co-localization of PRCl 
complexes at multiple sites in the genome. Although we 
did not detect CBX2 in HFs, we extended the evidence 
for co-localization to include 11 PRCl components that, 
as judged by ChlP-PCR with multiple primer sets, had 
virtually identical binding patterns at representative loci. 
Importantly, our ability to perform sequential ChIP with 
antibodies specific for different members of the Pc, Ph 
and See families argues against the idea that the com- 
plexes are acting redundantly in different cells. Rather, a 
proportion of these complexes must be simultaneously 
associated with the same DNA. 

While our findings do not exclude an element of se- 
lectivity in the association of specific PRCl complexes 
with specific loci, the overwhelming impression is that 
they congregate at common sites. One possibility is that 
they form multimeric assemblies analogous to the Pc 
bodies described for Drosophila [33-35]. Mammalian Pc 
bodies might comprise multiple permutations of PRCl 
rather than multimers of the same complex. Although 
the ChlP-seq data do not provide a three-dimensional 
picture, the fact that multiple complexes are co-localized 
along the genome implies that they are also co-localized 
in space. To our knowledge, this possibility has not been 
considered in the existing literature and most discus- 
sions about mammalian Pc bodies have centered on their 
visualization by fluorescence. In primary HFs, endogenous 
RINGl, BMIl and CBX4 have been detected as multiple 
speckles that are evenly distributed throughout the cell 
nucleus, similar to the situation in Drosophila [33,36,37] 
(data not shown). However, these early studies did not ad- 
dress the co-localization of different orthologs or the 
numbers of complexes that constitute a Pc body. Our 



attempts to provide such evidence were hampered by a 
lack of signal for most of the antibodies. 

The situation in HFs is in stark contrast to the evi- 
dence from human tumor cell lines where the PRCl 
proteins are typically found in a small number of very 
prominent nuclear bodies associated with pericentro- 
meric heterochromatin [36,38-40]. Although multiple 
PRCl orthologs are co-localized in these large aggregates 
[12,41-43], their functional significance has not been 
established. 

Another striking feature of the ChlP-seq data is that the 
binding profiles have a complicated topography, often 
with underlying periodicity, that is reproduced with differ- 
ent antibodies, with different preparations of chromatin, 
and in different cell backgrounds. The baroque nature of 
the peaks makes it difficult to generalize about their loca- 
tion relative to candidate target genes. Based simply on 
proximity to the nearest TSS, the ChlP-seq data suggested 
that PRCl complexes are co-localized at approximately 
1,000 loci in each strain of HFs. However, the distances 
can be very variable and the patterns do not conform to 
the idea that PRCl proteins bind within a few kilobases ei- 
ther side of the TSS, as suggested by meta-analyses of the 
profiles in ES cells [15,16,24,44]. 

Focusing on the candidate target genes, we found that a 
quarter of them are actively transcribed despite the pres- 
ence of PRCl in the adjacent DNA. Such observations are 
not without precedent but PcG complexes are generally 
viewed as obligate transcriptional repressors [3]. It is for- 
mally possible that PRCl complexes are present on only 
one allele allowing transcription to occur from the other 
but another possibility is that the assemblies of PRCl 
complexes that we observe are involved in 3D interactions 
that influence access to both positive and negative tran- 
scription factors. 

Although there were many parallels regarding the posi- 
tioning of PRCl complexes and their effects on gene ex- 
pression in the two strains of HF, there were several 
hundred loci at which the binding profiles were com- 
pletely distinct. Fibroblasts are often considered as a 
homogeneous cell type but different isolates have different 
growth rates and sensitivity to senescence [45,46] . Import- 
antly, HFs from different anatomical sites have character- 
istic patterns of gene expression that reflect their position 
relative to the developmental axes [47,48]. In this regard. 
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Figure 7 Genomic landscape of Polycomb orthologs is preserved at senescence. (A) Staining for senescence-associated (3-galactosidase 
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the BF cells would be considered anterior/trunk/non-der- 
mal whereas Hs68 cells would be posterior/distal/dermal 
[48]. As stromal fibroblasts are thought to influence tissue 
architecture, via mesenchymal-epithelial interactions, it is 
interesting to consider whether the signature patterns of 
PRCl occupancy represent an epigenetic ZIP-code'. 

Senescent fibroblasts also acquire characteristic pat- 
terns of gene expression coupled to a plethora of 



phenotypic and physical alterations [4,5]. At the chroma- 
tin level, senescence is accompanied by the formation of 
senescence-associated heterochromatin foci (SAHFs) 
that comprise non-overlapping layers of H3K9me3 and 
H3K27me3 modified chromatin [49-51]. However, a re- 
cent study suggested that SAHF formation represents a 
spatial repositioning of existing marks rather than 
changes in the landscape of histone modifications in the 
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genome [51]. Consistent with this report, the CBX bind- 
ing profiles that we observed in proliferating HFs were 
preserved at senescence and we did not see selective 
gains or losses of PRCl peaks that would be comparable 
to the striking differences between HF strains. Taken to- 
gether, our findings suggest that the effects of PRCl 
complexes on gene expression are not simply dictated 
by the presence or absence of PRCl at the locus. 

Conclusions 

Our findings question the prevailing view that different 
variants of PRCl act on different sets of target genes 
and suggest that they act collectively rather than indi- 
vidually. Different types of fibroblast have distinctive 
PRCl binding landscapes but the landscapes are pre- 
served at senescence. Collectively, our findings suggest 
that PRCl complexes have more subtle effects on gene 
expression than simply blocking transcription of the 
genes to which they bind. 

Materials and methods 

Cell cultures and immunoblotting 

The procedures for propagation of primary human fibro- 
blasts, staining for senescence-associated p-galactosidase 
activity and immunoblotting were as described [30,46,52]. 

Chromatin immunoprecipitation 

Chip assays were performed as described previously 
[53]. After sonication to obtain chromatin fragments of 
between 200 and 1000 bp, solubilized chromatin was di- 
luted to 1 [ig/[il and incubated with the appropriate anti- 
body at 4°C overnight. The antibodies are listed in 
Additional file 6: Table S2 and either species-matched 
irrelevant antibodies or pre-immune serum was used as 
the control as appropriate. After reversal of the crosslinks, 
the immunoprecipitated DNA was quantified by real- 
time PCR with the primer sets described in Additional 
file 6: Table S3. For sequential ChIP experiments, the 
eluted chromatin was divided into equal fractions, diluted 
tenfold in a dilution buffer (0.1% Na deoxycholate, 1% 
Triton X-100, 1 mM EDTA (ethylenediaminetetraacetic 
acid), 50 mM HEPES (4-(2-Hydroxyethyl)piperazine-l- 
ethanesulfonic acid, N-(2-Hydroxyethyl)piperazine-N'-(2- 
ethanesulfonic acid) pH 7.9, 150 mM NaCl) and subjected 
to a further round of ChIP with either the same or a dif- 
ferent antibody. 

Chromatin immunoprecipitation sequencing and 
bioinformatic analyses 

To generate sufficient quantities of DNA for sequence 
analyses, parallel ChIP reactions were performed using ap- 
proximately 5 [ig of antibodies with 500 [ig chromatin. 
The recovered material was pooled and concentrated to a 
minimum of 0.2 [ig/[il DNA samples were end repaired. 



poly-A tailed and Illumina single-end adapters were li- 
gated following the standard Illumina protocol with minor 
adjustments. Agencourt AMPure XP beads at 0.8 x ratio 
were used to size-select out adapter dimers after adapter 
ligation. The Phusion enzyme in the Illumina kit was re- 
placed by the Kapa HiPi HotStart ready mix. Post PCR, 
AMPure XP beads were used at a 1:1 ratio to maintain 
size integrity and to allow use of the Invitrogen SizeSelect 
E-gel system. We found that running the PCR before the 
gel improved visualization of the product and isolation of 
the correct band. Samples were finally purified with QIA- 
quick gel extraction kit (Qiagen Ltd) and quality con- 
trolled on the DNA 1000 Bio Analyzer 2100 chip before 
clustering. Input DNA was used as a control for the ChlP- 
seq analysis. Detailed protocols for library preparation and 
genome-wide sequencing are available online [54-59]. 

The alignments were performed using Novoalign (version 
2.07.14) [60] allowing for a single mismatch per read. Du- 
plicates reads were removed using the Picard MarkDupli- 
cates program (picard- tools package version 1.48) [61] and 
peak calling was performed using MACS (version 1.4.0rc2) 
[29]. The raw and processed data have been deposited 
under GEO accession number [GEO:GSE40740]. 

RNA-sequencing, quantitative reverse transcription and PCR 

Total RNA was extracted using the High Pure RNA ex- 
traction Kit (Roche). RNA samples were quality con- 
trolled (QC) using the 6000 Nano RNA Chip on a 
Bio Analyzer 2100 (Agilent) and subjected to poly-A se- 
lection using Sera-Mag oligo dT beads (Thermo Fisher 
Scientific Inc). Libraries were prepared using the Direc- 
tional mRNA-Seq Library Prep. vl.O. Pre-Release Proto- 
col from Illumina with minor adjustments [62]. The 
Phusion enzyme was replaced by the Kapa HiPi HotStart 
ready mix which reduced the overall volume of the PCR 
and the ratio for the Agencourt AMPure XP beads was 
adjusted accordingly. The standard PCR cycling was also 
changed to match the concentration of the total RNA 
from the initial QC. After passing the final QC, the li- 
braries were subjected to cluster formation and then 72- 
bp single-end sequencing on a GAIIx analyzer [62]. 

Sequenced reads were aligned using RSEM (version 
1.2.4) [63]. Each lane of the GAIIx produced approximately 
9 to 20 million unique reads that mapped to RefSeq genes 
archived in the Illumina iGenomes resource [64]. 

For qRT-PCR validation, cDNAs were generated from 
0.5 to 1 [ig of RNA using MultiScribe reverse transcript- 
ase and random hexamer primers (Applied Biosystems). 
Of the cDNA, 1/50 was used as a template for quantita- 
tive PCR with POWER SybrGreen (Applied Biosystems). 
RPS17 was used as an internal standard. Relative RNA 
levels were presented either as normalized to RPS17, or 
as fold-difference between the normalized values. Primer 
sequences can be found in Additional file 6: Table S4. 
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HPH3. (C) Similar evaluation of antibodies against RING1 and RING2. 
(D) Lysate from cells expressing Flag-tagged CBX4 (left) or CBX6 (right) 
were immunoprecipitated with antibodies against CBX4, 6, 7 and 8 as 
indicated. Following SDS-PAGE, the Flag-tagged proteins were identified by 
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Additional file 2: Figure S2. ChlP-PCR analyses of multiple PRC1 
proteins at representative loci. Each dataset includes a screenshot of the 
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suspected target gene. The primer sequences are described in Additional 
file 6: Table S3. The panels show the enrichment observed with the 
indicated antibody at each primer set as a percentage of input. Grey bars 
show values for a control IgG antibody. (A) GATA6 in BE cells, (B) CCND2 
in BE cells, (C) MEIS1 in BE cells and (D) NRN1 in Hs68 cells. 
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at the locus for the BE and Hs68 strains. TRUE and FALSE indicate 
whether the locus is transcriptionally active in BE and Hs68, as judged by 
RNA-sequencing. 

Additional file 4: Figure S3. ChlP-PCR showing differential binding of 
PRC1 proteins in BE and Hs68 cells. Each dataset includes a screenshot of 
the CBX7 binding profile across the locus (top), with a diagram showing 
the position of the PGR primer sets relative to the organization of the 
suspected target gene. The primer sequences are described in Additional 
file 6: Table S3. The panels show the enrichment observed with the 
indicated antibody at each primer set as a percentage of input. 
Grey bars show values for a control IgG antibody. (A) TBX2, (B) TBX4 
and (C) RUNX3. 

Additional file 5: Figure S4. ChlP-seq and RNA-sequencing profiles of 
the HOX clusters in BE and Hs68 cells. Upper tracks in each figure show 
the profiles of DNA sequence tag densities following ChlP-seq with 
antibodies against CBX6, CBX7, CBX8, H3K27me3 and H3K4me3 in the BE 
and Hs68 strains of HDF as indicated. The lower tracks show duplicate 
RNA-sequencing data for the corresponding genomic regions in either 
BE or Hs68 cells. The tag densities were normalized to the same 
maximum (numbers on left). 

Additional file 6: Table S2. List of antibodies used for ChIP. Table S3. 
List of oligonucleotide primers for PGR analysis of immunoprecipitated 
chromatin. Table S4. List of oligonucleotide primers used to assess RNA 
levels by reverse transcription and quantitative PGR. 
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